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APPARATUS FOR USE IN A COMPUTER SYSTEM; 

• TECHNICAL FIELD OF THE INVENTION 

The invention relates to apparatus for use in a 
computer system, and in particular, to apparatus 
5 utilising a bus architecture. 

BACKGROUND TO THE INVENTION 

Figure 1 of the accompanying drawings shows a 
computer system including a typical communications bus 
architecture. A plurality of modules are connected to 
10 a combined read and write bus and to a separate control 

bus, both of which are well known tri- state buses. The 
modules may be, for example, memory devices, graphics 
controllers, CPU's, and so on. The control bus and the 
read/write bus service all the requirements of the 
15 system, enabling the modules to transfer data between 

one another or externally, for example to external 
memory or control devices . 

With the ever increasing need to integrate more 
complexity and functionality into computer systems, and 
20 in particular into systems provided on a single 

integrated circuit, i.e. system-on-chip solutions, bus 
architectures need to be flexible enough to cope with 
the differing needs that individual modules will place 
on the system. 
25 One aspect that a system has to take into 

consideration is "latency". Latency is the amount of 
time that a module has to wait in order to transmit or 
retrieve data. Some modules are more sensitive, or 
intolerant, to this waiting period than others. 
30 Therefore, if latency sensitive, or latency intolerant, 

modules are forced to wait beyond a certain time limit, 
they will behave in a manner that will adversely affect 
the system performance and functionality. 

As more modules are connected to a bus, the size 
3 5 of the bus inevitably needs to be increased. . This in 
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turn can lead to an increased module-to-module 
distance, which increases the time taken to transfer 
data between modules. This can have an adverse effect 
on latency sensitive, or intolerant, modules. 
5 In systems which use a large amount of data that 

must be processed at high speed, for example graphics 
systems, it is important to be able to have efficient, 
high speed data transfer between modules of the system. 
A communications bus is therefore desirable which can 
10 enable different usage of the bus and is able to 

support high speed and high volume of traffic data 
transfer. 

SUMMARY OF THE INVENTION 

The aim of the present invention is to provide an 
15 improved computer system. 

According to the present invention, there is 
provided apparatus for a computer system comprising: 

a plurality of modules; 

a storage device; 
20 means for receiving read transaction requests from 

modules, each request including address data indicating 
the location of the data to' be retrieved, and identity 
data indicating the source of the transaction message; 

means for sending the address data to the storage 

25 device; 

means for storing the identity data in a queue; 
means for receiving a retrieved data item from the 
storage device; 

means for matching the retrieved data item with 
3 0 the identity data at the head of the queue; and 

n^ng for- returni ng the retrieved data to the 

module identified by t-he matched identity data. 

Apparatus embodying the invention may also include 
a bus architecture comprising: 
3 5 a primary bus having latency intolerant modules 
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connected thereto; 

a secondary bus having latency tolerant modules 
connected thereto; and 

a primary to secondary bus interface module 
5 interconnecting the primary and secondary buses. 

The length of the secondary bus will usually be 
greater than that of the primary bus . The primary bus 
is preferably less than, or equal to, three pipeline 
stages in length. 
10 Latency tolerant modules connected to the 

secondary bus can be arranged such that the least 
tolerant modules are located closer to the primary bus 
than the more tolerant modules. 

The bus architecture can have two or more 
15 secondary buses, each one connected to the primary bus 

via a separate primary to secondary interface module. 

The primary and/or secondary bus may comprise: 

a write data bus for transferring write data 
between modules ; 

2 0 a read data bus for transferring read data between 

modules ; 

a transaction bus for transferring control data 
between modules, 

the read data bus and the write data bus being 
25 physically separate from one another. 

With such an architecture, a read transaction by a 
master module can typically involve placing read 
address data indicating the location of the required 
data on the transaction bus to which the master module 

3 0 is connected, and the master module receiving the 

. required read data from a target module on the read 

data bus to which the master module is connected at an 
arbitrary time after the read address has been placed 
on the transaction bus, wherein the write and 
35 transaction buses are available for use during 
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reception by the master module of the return read data. 

The apparatus may further comprise: 

means for assigning each module in the system one 
of a predetermined number of priority levels; 
5 means for assigning each module an initial 

position within a queue; 

means for receiving respective transaction 
requests from modules; 

means for filtering the transaction requests from 
10 the modules so as to retain the requests from those 

modules having the highest priority level of the 
modules making such requests, thereby producing a set 
of filtered requests; 

means for selecting from the set of filtered 
15 requests, the transaction request from the module 

having the highest position within the queue, and 

means for sending a request grant message to the 
module from which the selected transaction request was 
received . 

20 The number of modules will typically exceed the 

predetermined number of priority levels . 

The apparatus preferably also includes means to 
place a module receiving a transaction grant message at 
the bottom of the queue . 

2 5 The apparatus can also include a first arbiter 

means for controlling initiating transactions on the 
bus architecture; and a second arbiter means for 
controlling return transactions on the bus 
architecture. 

3 0 When the primary and/or secondary bus includes 
separate read, write and transaction control buses, 

the first arbiter means preferably controls use of the 
write and transaction buses and the second arbiter 
means preferably controls use of the read bus . 
35 In apparatus embodying the invention, each module 
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can be assigned an address range in a memory map of the 
apparatus, and the apparatus can then further comprise: 

reception means for receiving and storing 
availability data indicative of the availability of 
5 modules ; 

transaction request means for producing a 
transaction request including target address data 
indicating a target location in the memory map for the 
transaction; 

10 decoding means for decoding the target address 

data to produce identity data relating to a target 
module, the target module being assigned an address 
range in the memory map which includes the target 
address data; 

15 comparison means for analysing the stored 

availability data corresponding to the target module 
identified by the identity data; and 

transaction means, responsive to the comparison 
means, for terminating the transaction request if the 

20 analyzed availability data indicates that the target 

module is unavailable. 

Such apparatus can further comprise a control 
means for controlling access to the bus architecture by 
the modules and wherein the transaction means is 

2 5 operable to forward the transaction request -to the 

control means, if the analyzed availability data, 
indicates that the target module is available . 

Apparatus embodying the invention can also 
comprise: 

3 0 means for receiving data requests from modules for 

respective required data packets, each request 
including address data indicating the location of the 
data packet concerned, and sequence data indicating the 
timing of the request relative to other data requests; 
3 5 means for receiving returned data packets and 
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associated sequence data; 

a storage device for storing the returned data 
packets in respective storage locations therein, in the 
order indicated by the associated sequence data; and 
5 retrieval means for retrieving data packets from 

the storage device in the order of the storage 
locations of the storage device. 

The retrieval means can be operable to cycle 
through the storage locations of the storage device. 
10 The storage device can be provided by a dual port 

RAM device . 

Apparatus embodying the invention can also 
comprise: 

an arbitration unit for granting access to the bus 

15 in response to requests received from the modules, the 

granting of access being in the form of a dedicated 
data packet issued from the arbitration unit, whereby 
only the module which has been granted access can use 
that particular dedicated packet to gain access to the 

20 bus, the arbitration unit being operable to issue empty 

data packets during periods when the bus is idle, the 
empty data packets being usable by a module to gain 
access to the bus without making a specific request to 
the arbitration unit for a dedicated packet. 

25 In such apparatus the first module to use the 

empty packet gains access to the bus. Each module may 
have means for converting a dedicated packet intended 
for itself into an empty packet. The arbitration unit 
may be located at one end of the bus. 

3 0 In apparatus embodying the invention the or each 

bus may also comprises: 

a plurality of bus connection units for connecting 
modules to the bus concerned; and 

a plurality of bus portions arranged in series, 

35 each bus portion, except the last in the series, being 
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connected to the next portion in the series by way of a 
bus portion connection unit.. 

Each bus connection unit may include multiplexer 
circuitry for selectively connecting a module to the 
5 bus concerned. Each bus connection unit may include 

output circuitry connected to the bus portions to which 
the unit is connected, the output circuitry being 
optimised for the length of the bus portions concerned. 
Preferably, the bus portions are all substantially 
10 equal in length. 

Embodiments of the invention are usefully 
implemented in a computer system, an integrated 
circuit, a graphics processing system, or a games 
console including apparatus as set out above. 
15 BRIEF DESCRIPTION OF THE DRAWINGS 

For a better understanding of the present 
invention, and to show more clearly how it may be 
carried into effect, reference will now be made, by way 
of example, to the accompanying drawings, in which: - 
2 0 Figure 1 shows an apparatus having a known bus 

architecture for use in a computer system; 

Figure 2 shows an apparatus having a different bus 
architecture for use in a computer system; 

Figure 3 illustrates in more detail the apparatus 
25 of Figure 2; — ' - 

Figure 4 shows an apparatus having a further 
different bus architecture for use in a computer 
system; 

Figure 5 illustrates an apparatus having a further 
30 different bus architecture for use in a computer 

system; 

Figures 6 and 7 illustrate data transfer during 
write transactions in the apparatus of Figure 5; 

Figures 8, 9 and 10 illustrate data transfers in 
35 read transactions in the apparatus of Figure 5; 
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Figure 11 illustrates an apparatus incorporating 
an arbitration unit for use in a computer system; 

Figure 12 is a block diagram illustrating the 
arbitration unit of Figure 11; 
5 Figure 13 is a flow chart illustrating a method of 

bus arbitration for the apparatus of Figure 11; 

Figure 14 illustrates priority level assignments 
used in the scheme of Figure 13; 

Figures 15 to 19 illustrate register entries for 
10 the method of Figure 13; 

Figure 20 illustrates an arbitration unit for an 
apparatus for use in a computer system; 

Figure 21 illustrates transaction implementation 
using the arbitration unit' of Figure 20; 
15 Figure 22 shows an arbitration unit for an 

apparatus for use in a computer system; 

Figure 23 illustrates a bus structure for use with 
the arbitration unit of Figure 22; 

Figure 24 illustrates transaction implementation 
20 using in the arbitration unit of Figure 22; 

Figure 2 5 
apparatus ; 

Figure 2 6 
system; 
25 -figure 2 7 

Figure 2 8 
module ; 

Figure 2 9 
module ; 
30 Figure 30 

Figure 31 

computer system, whereby the apparatus uses idle time 
slots to reduce latency. 

3 5 DETAILED DESCRIPTION 



illustrates an address decoding 

illustrates a memory map of a computer 

illustrates an address decoder; — 
illustrates a representative master 

illustrates representative target 

illustrates a modular bus topology; and 
illustrates an apparatus for use in a 
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Apparatus used in computer systems typically use a 
communications bus architecture, or data pipe to 
interconnect modules of the computer system. In this 
disclosure, a "module" means a functional unit that 
5 connects to the bus, and that provides a particular 

function for the computer system. A module is usually 
able to transfer data to and from other modules in the 
system, and such a transfer of data is known as a 
"transaction". A module that initiates a transaction 

10 is known as a "master module" and a module which is the 

subject of a transaction is known as a "target module". 
In order to initiate a transaction, a master module 
issues a "transaction request" to an arbitration unit 
(or "arbiter") which determines how access to the bus 

15 will be controlled. A "data packet" is defined as a 

predetermined number of data bytes, for example 32, 
which is used for transferring data across the bus 
architecture. A write bus is a bus which is used for 
transferring data packets to be stored at a target 

2 0 module, and a read bus is a bus used for transferring 

so called "mature data" to a master module in response 
to a read transaction. "Mature data" is data which is 
ready to be returned from a target module to a master 
module in response to a read transaction. A "system 
25 Cycle" is defined- as one clock period of the system 

clock, and a "pipeline stage" is defined as the bus 
length travelled by a data pulse in a single system 
cycle . 

It will be readily appreciated that the bus 

3 0 architecture and features described below may be used 

in a computer system including individual components 
connected to one another by external interfaces, or in 
an integrated circuit (on-chip) system in which modules 
are formed on a single integrated circuit, or in a 
3 5 combination of the two. Preferably, such a system 
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involves the use of an integrated circuit (on-chip) 
system. 

PRIMARY AND SECONDARY BUS ARCHITECTURE 

Figure 2 illustrates a computer system including 

5 an improved bus architecture which comprises a primary 

bus 3 and a secondary bus 4 . The primary bus 3 has a 

plurality of modules M x ... M n connected thereto. Each 

module connected to the primary bus 3 is a latency 

intolerant module. 

!0 The secondary bus 4 also has a number of modules 

M t ... M_ , v connected thereto. However, each module 
n+1 n+x 

connected to the secondary bus 4 is a latency tolerant 
module . The primary bus and secondary bus are 
interconnected by an interface module 5, known as a 

15 primary to secondary bus interface module 5 . 

In this manner, modules which are expected to 
operate at relatively high speed, (i.e. the latency 
intolerant, or sensitive modules) , are connected to a 
dedicated primary bus 3 that can service their speed 

20 requirements. Latency intolerant modules cannot wait 

for incoming data, and typically make frequent requests 
for relatively small amounts of data. 

Latency tolerant, or latency insensitive, modules 
are connected to the secondary bus 4 . These modules 

25 are prepared to wait for data, and typically make 

infrequent requests for relatively large amounts of 
data. 

Figure 3 shows a more detailed view of the 
computer system of Figure 2. The primary bus is 
3 0 preferably only one pipeline stage in length. 

Therefore, if two modules are connected by a single 
pipeline stage, a signal will travel between the two 
modules during a single clock cycle. Similarly, if a 
connection between two modules is two pipeline stages 
35 long, two clock cycles will be required for the signal 
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to travel from one module to another. 

Since all the modules connected to the primary bus 
are less than one pipeline stage apart, the primary bus 
guarantees that data transfer between any two modules 
5 on the primary bus is always less than one clock cycle. 

The secondary bus, on the other hand, can be of 
any length, and can be expected to be much longer than 
the primary bus. The secondary bus is made up from a 
plurality of pipeline stages. Thus, the secondary bus 
10 cannot guarantee a minimum transaction time between a 

master and a target module. For example, in Figure 3, 
module M n+X is seven pipelines away from module M n+1 . 
Therefore, data transfer between modules M n+X and M n+1 
will take seven clock cycles. Although Figure 3 shows 
15 the modules separated at intervals of approximately one 

pipeline, they may be several pipeline stage apart, or. 
several modules may be arranged within a single 
pipeline stage. 

The interface module 5 serves as a collation point 
20 for all incoming and outgoing requests to/from the 

secondary bus . The primary to secondary interface 
module 5 has a first-in-first-out register (FIFO) to 
hold outgoing requests before they can be accepted on 
the primary write bus. 
25 . - Preferably, the interface module 5 includes— a FIFO 

capable of holding a fixed number of pending read 
and/or write requests. The FIFO should be sufficiently 
large to hold enough requests to keep the primary bus 
active should the interface module 5 be allowed to make 
3 0 consecutive requests, but small enough such that it is 

not an overhead. Typically, the FIFO will consist of x 
entries of n bits (ie. data and transaction 
information) . 

A modification of such a bus architecture provides 
35 a plurality of secondary buses. Figure 4 shows an 
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arrangement whereby a plurality of secondary buses 4, 
4 1 and 4 2 are connected in parallel. Each secondary 
bus has a corresponding secondary to primary interface 
5, and 5 2 for connecting each secondary bus to the 

5 primary bus 3 . 

The provision of a number of secondary buses in 
parallel is advantageous when there are too many 
modules to connect to a single secondary bus. 

Although the primary bus has been described as 

10 being just one pipeline stage in length, the primary 

bus could still provide the required latency 
sensitivity if it is two, ' or perhaps three pipeline 
stages in length. However, it is unlikely that a 
practical embodiment of a system utilising a split 

15 primary- secondary bus would use a primary bus having 

more than a single pipeline stage. 
SPLIT READ-WRITE BUSES 

In the known bus system of Figure 1 described 
above, the integrated tristate read and write data bus 

2 0 has the disadvantage that a read transaction from a 

master module involves placing the address to be read 
on an address bus, and then> awaiting the data which has 
been read from the target module to return on the 
combined read/write data bus. This wait for the read 

2 5 data to be returned on ^he data bus ties- up the bus 

activity. Other modules wishing to make transactions 
are unable to gain access to the bus. Furthermore, 
when changing from a write transaction to a read 
transaction, (known as turn around) , valuable 
30 processing time is lost. 

Using a split transaction procedure can overcome 
this problem, since a read transaction can adopt a 
"fire -and -forget" approach. This means that the read 
data is returned at some arbitrary time after the read 

3 5 address is issued. However, the combined read and 
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write bus is still tied up while the read data is 
returned. 

To overcome this problem, the read and write bus 
can be split into separate read and write buses. In 
5 this manner, read data is returned on a dedicated bus, 

which means that the write data bus and transaction bus 
are available to carry out further transactions. A 
benefit of this architecture is that the read latency 
does not consume bus time, as happens on a conventional 
10 bus which is frozen while the read cycle is serviced. 

Figure 5 shows a split read/write bus . Each 
module is connected to a write data bus 13 and a read 
data bus 14. A transaction bus 12 controls data 
transfer between any two modules 10, 11. If a data 
15 rate of 3.2 GB/s is required, and the operating 

frequency is 200 MHz, a 128 bit path will be needed to 
sustain this bandwidth. 

This approach removes the need for tri- state 
buses, since each module is "tapped" as a simple 
2 0 connection, or by way of a multiplexer which drives the 

segment of bus up to the next module. This makes the 
entire structure point to point with no tri-state 
drivers being required. The transaction bus 12 is 
provided for information pertaining to the read and 
2 5 write transactions, for example address traffic. Thi-s 

bus runs in parallel with the write data bus 13, and 
flows in the same direction. The transaction bus 12 
can initiate a transaction every clock cycle at 200 MHz 
to fit the natural data size of the external 
30 interfaces. A single transaction consists of a packet 

of typically 32 data bytes. Larger transfers are 
always split into packet transactions which are 
independently arbitrated. The data bus width is chosen 
to be typically 128 bits (or 16 bytes) . This means 
35 that a transaction takes two cycles to process on the 
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read or write data buses. 

Since read transactions can be initiated every 
second cycle, and write transactions likewise, the 
transaction bus 12 can alternate initiations of read 
and write transactions on each alternate cycle, and 
keep both data buses saturated. In this way, such a 
bus structure can support, for example, up to 3.2 GB/s 
of read traffic and up to 3.2 GB/s of write traffic 
simultaneously. In addition, consecutive read 
transactions are also allowed, in the absence of any 
write transactions requests. 

Figure 6 shows data transfer on the write data bus 
13 where data is transferred as one packet. In one 
cycle, t, information pertaining to the write data 
15 (i.e. address, master ID tag, master sequence tag) W Ta 

is placed on the transaction bus 12 and the write data 
W Da is placed on the write data pipe. 

Figure 7 shows an alternative whereby the transfer 
of data takes place over two cycles. In cycle t, the 
2 0 write address information W Ta is placed on the 

transaction bus 12 and the first half of the data is 
placed on the write data bus 13. During cycle t + 1, the 
second half of the write data packet is placed on the 
write data bus 13 . This leaves an idle slot on the 
25 transaction bus 12, whiclr -could be used to request a 

read packet . 

Figure 8 shows that when a read request packet is 
made, the transaction bus 12 conveys the information 



R. 



Ta 



regarding the data-fetch (address etc.). During 



3 0 such a request, the write data bus 13 is empty. 

The read data pipe 14 is used by target modules to 

send data to the master modules in response to Read 
requests as shown in Figure 8. Figure 9 shows how the 
read data bus can convey the data from the target to 

35 the master in one clock cycle, t. 
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Alternatively, if only a half-size data bus is 
used, the read data is conveyed over two cycles, t and 
t + 1 . 

An advantage of partitioning the bus into separate 
5 transaction, write and read buses, is that it is 

possible to reduce latency problems in a latency 
sensitive environment. Furthermore, there is no need 
to rely on tri-state schemes that are complex to 
control . 

10 ARBITRATION SCHEME FOR BUS ACCESS 

In a computer system such as those described 
above, which include more than one module, some form of 
arbitration is required to decide which of the modules 
is to be allowed access to the communications bus at 
15 any one time. 

Figure 11 illustrates such a computer system, in 
which five modules Ml to M5 are connected to a 
communications bus 20 and also to an arbitration unit 
(or arbiter) 21. It will be appreciated that the 

2 0 communications bus is preferably in accordance with 

those described with reference to Figures 2 to 10 
above, but that the arbitration scheme is applicable to 
any bus architecture. 

The arbitration unit 21 communicates via control 
25 lines 22 with the modules Ml to M5 , and the modules 

communicate with the bus 20 via links 23. In general 
terms, a transaction is completed as follows: the 
arbitration unit 21 receives a transaction request from 
a module (the "master" module for the transaction) , and 

3 0 if the bus is available to that module, then the 
arbitration unit 21 returns a transaction grant 

message. The master module concerned then places data 
on to the bus. For example, using the split 
read/write/transaction bus described above, if module 
35 Ml is to undertake a write transaction, it requests use 
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of the write bus by sending a write transaction request 
via control lines 22 to the arbitration unit 21. The 
arbitration unit 21 decides whether the master module 
Ml is able to use the write bus, in accordance with the 
5 method to be explained in more detail below. If the 

master module Ml is able to use the write bus, the 
arbitration unit 21 issues a transaction grant message 
to the master module Ml and a transaction granted 
message to the target module. Upon receipt of the 

10 grant message,, the master module Ml supplies address 

data on the transaction bus and corresponding write 
data on the write bus . The target module then receives 
the address and write data from the bus and processes 
the data appropriately. 

15 Figure 12 is a block diagram illustrating an 

arbitration unit 21 for use in the computer system of 
Figure 11. The arbitration unit 21 includes a control 
means 24 for carrying out the arbitration scheme, a 
request receive means 25 for receiving transaction 

2 0 requests from modules in the computer system, priority 

level storage means 27 for storing information 
indicating an assigned priority level for each of the 
modules in the system, and stack storage means 2 6 for 
storing stack (or queue) information relating to the 

2 5 modules in the system. 

An arbitration scheme will now be described with 
reference to the flow chart of Figure 13 and the 
register charts of Figures 14 to 21. The arbitration 
scheme to be described can be used for the read and 

3 0 write buses together, or individually. The result of 
the arbitration scheme determines which of the modules 

Ml to M5 is granted use of the bus concerned. 

Each module Ml to M5 is assigned a relative 
priority level . In the example shown in Figure 14 , 
3 5 modules Ml and M4 are assigned a high priority level, 
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modules M2 and M5 a medium priority level and module M3 
a low priority level. The priority levels are stored 
in the priority level storage means 27 in the 
arbitration unit 21. 
5 The initial set up of the arbitration scheme is to 

arrange the modules into initial stack positions (step 
A) which are stored in the stack storage means 26. 
These initial positions are illustrated in Figure 15. 
It will be appreciated that the initial stack positions 

10 are arbitrarily chosen. In the example shown, Ml is at 

the top of the stack and M5 at the bottom. 

In step B, the arbitration unit 21 receives 
respective transaction requests from any number of the 
modules Ml to M5 . For example, all five modules may 

15 wish to make transaction requests at the same time. 

The transaction requests are received by the request 
receive means 25 and are forwarded to the control means 
24. At step C, the control means 24 determines which 
of the transaction requests are associated with the 

20 highest priority level of the modules issuing requests. 

In the example where all five modules Ml to M5 issue 
transaction requests, modules Ml and M4 can be seen to 
have the highest priority levels of the requesting 
modules. The control means then masks out (step E) all 

25 requests from lower priority modules. Thus, only the 

requests from modules Ml and M4 are processed further. 
This is illustrated in Figure 16. 

The second stage of the arbitration scheme is to 
determine the stack positions (step F) of the modules 

3 0 whose requests are to be processed further. The 

transaction request from the module occupying the 

highest, position in the stack is selected for . ... 

acceptance (step G) . 

This is illustrated in Figure 17 in which it can 

3 5 be seen that the module Ml is at a higher position in 
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the stack than the module M4 , and so the request from 
module Ml is accepted over that from module M4 . In 
step H, a transaction grant message is then sent to the 
selected module (Ml in this example) so that the module 
5 can use the bus in the prescribed manner (step I) . 

The entry in the stack relating to module Ml is 
then moved (step J) to the bottom of the stack, and the 
entries corresponding to the remaining module M2 , M3 , 
M4 and M5 move up the stack as illustrated in Figure 

10 18. The arbitration scheme is then used again with the 

new stack, position when the next request or requests 
are received from the modules. 

For example, as shown in Figure 18, if modules M2 , 
M3 and M5 issue transaction requests together, the 

15 first stage of the arbitration scheme selects M2 and 

MS, since these two modules have the highest assigned 
priority level of the requesting modules. The second 
stage of the arbitration then selects the request from 
module M2 since module M2 is at a higher position in 

2 0 the stack than module M5 . Module M2 is then moved to 

the bottom of the stack, resulting in the stack shown 
in Figure 19, with module M3 at the top of the stack 
followed by M4 , M5 and Ml with the module M2 at the 
bottom. 

25 In the particular bus architecture described 

above, using primary and secondary buses and split 
read, write - and transaction buses, the primary write 
bus arbitration scheme as described functions over two 
clock cycles which allows sufficient time to gather 

3 0 transaction requests from master modules, to arbitrate 
as described, and to issue transaction grant message to 

the successful master modul.e and its target module . 

In, for example, a graphics system, there can be 
five possible master modules which can require use of 
3 5 the write data bus. These are the processor, the 
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graphics controller, the interface controller between 
the primary and secondary bus system, the advanced 
graphic port, the PC component interconnect and the 
memory, ie . RAM bus connection. There are three 
5 possible target modules, the PC component interconnect, 

the advanced graphic port and channels 0 and 1 of the 
memory, ie . RAM. In such a system, all masters except 
the processor have entries in the stack system in order 
to maintain the fairness during arbitration. The 
10 processor is assigned the highest priority of all of 

the master modules making requests, and therefore need 
not be included in the stack, since whenever the 
processor makes a request for bus usage, the bus is 
assigned to the processor. 
15 Preferably, the priority level assigned to each of 

the modules in the stack can be individually programmed 
(either during design, or by the user) , for example by 
assigning an appropriate two-bit code for each module. 
For example, a two bit coding scheme provides three 
20 priority levels; low, medium and high; as well as a 

disabled level in which no requests are granted to the 
module concerned. 

When a master module wishes to make a request, it 
transmits its request signal together with the 
2 5 information concerning the target module and whether a 

read or write transaction is to be performed. As 
described above, incoming requests are assessed firstly 
according to their incoming priority level and secondly 
according to their position within the stack. It is 
30 therefore possible to maintain an order of fairness 

when granting the requests . 

.. DUAL ARBITRATION UNITS - DUAL TRANSACTION ARBITERS - 
A system can be implemented using an arbitration 
unit which contains two separate arbiters. Figure 2 0 
35 illustrates an arbitration unit 31 for use in a 
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computer system using separate transaction, write and 
read buses. The arbitration unit 31 is linked to the 
modules (not shown) of the system in order to control 
access to the bus. Requests to the arbitration unit 
5 and grant signals are carried by control lines 34. 

The arbitration unit 31 contains an initiation 
arbiter 32 and a retirement arbiter 33. Each of the 
arbiters within the arbitration unit are connected to 
the modules. It will be appreciated that many modules 
10 can be connected to a single arbitration unit. 

The initiation arbiter 32 is concerned with 
initiating (or launching) read and write transactions 
on the transaction bus or on the transaction and write 
data buses, respectively. For a read transaction, the 
15 initiation arbiter 32 will grant use of the transaction 

bus to a master requesting read data. The use can be 
granted on the basis of the arbitration method 
described above, or on any other basis. 

A write transaction, requires simultaneous use of 
20 both the transaction and write buses for a single 

cycle . 

The retirement arbiter 1 deals with read data when 
it arrives back from a target module. Multiple targets 
may mature read data in the same cycle, and so 

25 arbitration of the read data bus is controlled 

separately from the transaction and write data bus. 
The retirement arbiter 33 receives transaction requests 
from target modules holding mature data and assigns use 
of the read bus in an appropriate way, for example as 

3 0 described above. 

The read data will take time to "mature" at the 

target and then, appears or. the read data bus at some 
indeterminate, time in the future after the read request 
has been made . 
35 Since the two arbiters 3 2 and 3 3 operate 
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independently , the transaction bus can be fully- 
utilised, by using the vacant clock cycle associated 
with a write data transaction to issue a read 
transaction on the transaction bus. . This is 
5 illustrated in Figure 21. The transaction bus is 

ideally alternated between read and write states so 
that the write data bus is also fully utilised carrying 
the two packet write data packets . 

The initiation arbiter is able to receive requests 

10 and grant bus use within a single clock cycle. When a 

module is granted the use of the transaction bus the 
transaction address data is placed on the transaction 
bus, and, for a write transaction, write data is placed 
on the write bus. On the next clock cycle, a read 

15 transaction can be initiated . from the same module or 

from a different module. This means that transactions 
must not be initiated to targets that cannot respond to 
them on that cycle or the transaction will fail. A 
method is described in more detail below, in which it 

2 0 is possible to determine whether a target is able to 

accept a transaction, before the transaction is 
requested. 

In addition, multiple modules may request read 
data from a single target. Therefore each maturing 
25 data packet must be retired (returned) back to the 

master which requested it. To do this, each target 
maintains a list of outstanding :read transactions. 
Preferably this is implemented in the form of a first - 
in-first-out (FIFO) register having a size equal to the 

3 0 maximum number of read transactions that can be 
processed. Identification data tags can be used and 

associated with the requests so that the returning data 
can be identified. Such a system will be described in 
more detail below. 
3 5 When read data matures in a target, it requests 
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the retirement arbiter for use of the read data pipe 
line. When the use is granted, it outputs both the 
data packet and the master ID tag from the transaction 
data store. The retirement arbiter asserts a strobe to 
5 the master unit which originally requested the data so 

that master unit can consume that data. An important 
assumption in an ideal system is that the master is 
able to consume immediately any data packet which it 
has requested. 

10 Arbitration for the read bus can be similar to 

that for the write bus but can be simplified if each 
master module is able to accept the matured read data 
as soon as it is made available. A similar two-stage 
filter and stack arbitration system can be used to 

15 arbitrate between the various sources of read response 

data, and no interaction is required from the write or 
transaction buses. The read data transfer is achieved 
over two cycles (two half packets) . Once again, the 
priorities could be programmable, but in a preferred 

2 0 example, the priorities are fixed. When mature data is 

ready at a target, that target module indicates to the 
read arbiter that it wishes' to send data to a 
particular master module. Use of the read bus is then 
controlled by the retirement arbiter. 
25 DUAL ARBITERS FOR SPLIT R EAD /WRITE BUSES 

An alternative arbitration unit includes separate 
write bus and read bus arbiters, for example as 
illustrated in Figure 22. In such a case, the bus 
preferably includes two transaction buses -one for read 

3 0 and one for write. Such a system is illustrated in 
T^-ig,,^ si wh P re the transaction bus TW 43 is associated 

with the write -data bus W 44 and the read transaction 
bus TR 46 is associated with the read data bus R 45. 
As illustrated in Figure 24, since the read and write 
3 5 data buses 44 and 4 5 and the read and write transaction 

HL70718 



-23- 

buses 43 and 46 are entirely separate, then it is 
possible to perform simultaneous independent read and 
write operations. 
ADDRESS DECODING 
5 As described above, in a computer architecture 

employing a bus system and multiple modules connected 
to that bus system, some form of arbitration is 
required to determine which of the modules can have 
access to the bus system. Preferably, the computer 

10 system is defined by a memory map in which respective 

address ranges are allocated to individual modules. In 
such a system, each module can address the other 
modules simply by using a single address value. Thus, 
if module M3 wishes to write data to a particular 

15 address, it simply issues address data equivalent to 

that address. This address data needs to be decoded to 
determine the target module identity. Preferably, each 
module Ml, M2 and M3 supplies information to all the 
other modules indicating when it is busy (i.e. 

20 unavailable for transactions) . 

An example of a suitable decoding arrangement 58 
is shown in Figure 25 and comprises a transaction 
request generator 53, for producing transaction 
requests, an address decoder 50, a module availability 

25 reception and storage means 51, and an 

address/availability comparison means 52. The 
operation of these. units will be described with 
reference to Figures 25, 26 and 27. Figure 26 
illustrates the memory map space for the computer 

3 0 system and it shows the address regions assigned to 

modules Ml and M2 . The address regions are bounded by 

specific addresses MIA and M1B for module 1, and M2A 
and M2B for module 2 . 

When module M3 wishes to perform a transaction 

3 5 with module Ml, for example, module M3 generates a 
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transaction request including target address data. The 
address decoder means 50 receives the target address 
data. The address decoder 22 operates to determine 
which of the modules Ml and M2 is- assigned the address 
5 region into which the target address data falls. 

A specific implementation of the address decoder 
is shown in Figure 2 7 and will be described in more 
detail below. 

The result of the address decoding is supplied to 

10 the address/availability comparison means 52. The 

module availability reception means 51 receives and 
stores information indicating whether the modules Ml, 
M2 and M3 are busy or able to receive, a transaction 
request. The address/availability comparison means 

15 uses the decoded address information and the relevant 

stored module availability data to determine whether 
the intended target is able to accept the transaction. 
If the availability information corresponding to the 
decoded address signal indicates that the module 

20 concerned is unavailable, then the requested 

transaction is halted until it is available. However, 
if the signal indicates that the module concerned is 
available to receive a transaction, then the 
transaction request is forwarded to the arbitration 

25 unit 

In this way, the arbitration unit can only be 
provided with requests that are made for available 
modules . 

Figure 27 illustrates one possible implementation 
30 of the address decoder 50. The decoder 50 includes 

registers 61 and 62 for storing respectively the upper 
and lower boundary address values of a particular 
module, In the examples shown, the upper value MIA of 
module Ml is stored in register 61 and its lower value 
35 M1B is stored in register 62. A comparator 63 compares 
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the upper value with the incoming address 6 0 and 
outputs a high signal if the address is less than the 
value given by MIA. Similarly, a comparator 64 outputs 
a high signal if the incoming address 60 is greater 
5 than or equal to the lower value M1B . The outputs of 

the comparators 63 and 64 are combined by an AND gate 
65 such that if the incoming address is greater than or 
equal to the lower value M1B and less than the upper 
value MIA, then a high output is supplied at 66. If, 

10 however, the address signal indicates that the required 

address is outside of this range, then the output 66 is 
low. Thus, only the selected 66 is high. 

The decoding arrangement can thus effectively 
provide pre-arbitration stage, which only allows 

15 arbitration of those transaction requests which are 

most likely to be successful if granted. Any 
transaction request which specifies a target which is 
already known to be busy, it will simply not be 
forwarded to the arbitration unit. 

2 0 In an alternative arrangement to the above, 

availability information could be sent to an 
arbitration unit, and used within the arbitration 
process in granting access to the bus (that is, the 
module availability reception and storage means 51 and 

2 5 the address/availar>ility comparison means 52 shown in 

Figure 25 could be located within an arbitration * unit ) . 

DATA PACKET REORDERING 

A complication can arise if a master module 

3 0 requests read data from two or more targets that 

^ supports bit transactions. In such a case there ijs the 

.. possibili-ty that data may mature from the targets in a 
different order from that in which it was requested. 
The master module could then receive data packets in 
3 5 the wrong order. This problem, as will be described in 
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more detail below, can be solved by providing 
transaction tag register in each. target so that module 
ID code and a transaction sequence code can be utilised 
for that module. A sequence code is generated by each 
5 module from, for example, an end bit counter, which is 

incremented whenever the module concerned is granted a 
read transaction. Thus, when read data is matured in a 
target and returned (or retired) back to the requesting 
master, it has an associated sequence code transmitted 

10 as well. This means that the master can re-order the 

data packets as they arrive. 

An advantage of such a scheme is that it is 
potentially deadlock free, since the only blocking 
condition is when a target is full and cannot accept 

15 further transactions. This condition only affects 

those masters wishing to access that particular target 
and will clear as the target processes its transaction 
queue. Read data can always be retired since the 
requesting masters are defined as always being able to 

2 0 accept data they have requested. 

A master that makes requests for read data from 
more than one target within* the overall latency time of 
any of those targets may well receive data packets in 
the wrong order. Since the master must accept data 

2 5 packets regardless of the order in which it receives — 

them, some method is required to re-order the data 
packets in the correct order for consumption by the 
function provided by the master. 

Figure 2 8 shows representationally a master module 

3 0 MN, which is connected to a bus 70, and which provides 

a module function 79. The master module MN requests 
data using a transaction request inducing transaction 
address data supplied on the transaction bus by a 
transaction output request stage 74. The transaction 
35 request also includes a transaction sequence tag which 
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is produced by a sequence producer counter 71. This 
sequence tag indicates the relative order in which the 
transaction has been produced. When read data is 
received, via an input 76, that read data packet has a 
5 read packet sequence tag associated with it which is 

received by an input 77. The read sequence tag, which 
is equivalent to the transaction sequence tag output by 
the master with its transaction request, is used to 
indicate where the read data packet should be stored 

10 within a two port memory, ie . RAM, buffer 75. The read 

data packet 7 6 is input via the memory write port and 
is written at a position within the memory indicated by 
the sequence tag 77. 

The consumer counter 72 provides a signal for 

15 controlling the output from the RAM 75. The data 

packets are read from the RAM via its read port in 
strict rotation. The consumed data 78 is thereby 
provided to the module function 79. 

The RAM re-order buffer is treated as a circular 

20 queue, which is addressed by the sequence tags. 

Packets arriving on the bus are written to a RAM slot 
indicated by the input sequence tag associated with the 
packet, whilst packets are read from the buffer in 
strict order. The consumption of data by the master 

25 will stall during a~v eriod of waiting for the next 

expected data packet . The arrival of data packets 
associated with later transactions are ignored- The 
number of transactions that can be awaited in this way 
is determined by the number of addresses in the dual 

3 0 port RAM. 

TARGET MODULE DATA TAG QUEUES 

A similar function can be provided at a target 
module so that the data provided by the target module 
in response to a read request is output from the module 

35 in the order in which the transactions are received by 
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the target module. This can be used in addition to the 
data packet re-ordering scheme mentioned above, or 
independently of such a method. 

As shown in Figure 29, a representative target 
5 module having a target module function 8 6 is connected 

to a bus 80 for transferring data packets to and from 
requesting master modules (not shown) . A transaction 
request is received by the transaction input line 81 
and is stored in a transaction first in first out 

10 (FIFO) register 83. If the register 83 is full, then a 

full flag is returned via a control line 82 to the 
requesting module or arbitration unit (not shown) . 

The first in first out FIFO register 83 supplies 
address data to the target function 86, whilst 

15 identification and sequence data is passed to a further 

first in first out FIFO buffer 87. This buffer 87 
provides a tag queue, and is used so that, master module 
identification and sequence data can be recombined with 
data packets read by the target function 86. When a 

2 0 data packet to be output to the master is returned by 

the target function 86, then the associated tag (ID and 
sequence data) is output onto the bus at the same time, 
thereby identifying the output read data. The tags are 
output using control lines 8 8 and the read data packet 

2^ output on the line 89. 

If the transaction concerned is a write 
transaction, where write data is to be written to che 
target function 86, then the address and write data is 
passed to the target function 86 from the transaction 

30 FIFO 83, but the ID and sequence data is not 

transferred to the tag queue, since no return data is 
required in that case. 

' It will be readily appreciated that the use of 
first-in-first-out (FIFO) registers 83 and 87 provides 

3 5 a method of ordering incoming transactions and outgoing 
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data . 

MODULAR TOPOLOGY 

In all the bus architectures described above, the 
common aspect is that a number of modules Ml, M2 and M3 
5 etci are connected to a single bus. It is therefore 

desirable to provide a scheme in which modules can be 
easily attached to the bus architecture, without a 
corresponding change in bus properties. 

Figure 30 illustrates a modular bus architecture 

10 topology. Three modules are shown in the example of 

Figure 30, but any number of modules could be connected 
in this way. A bus 90 is punctuated by a number of bus 
connection modules 91-^ 91 2 and 91 3 which are 
preferably, but not necessarily, equally spaced along 

15 the bus 90. Each of the connection modules 92 1 , 92 2 or 

92 3 allows a module Ml, M2 or M3 to be connected to the 
bus . 

The bus connection modules 91 1 , 91 2 and 91 3 are 
typically controlled such that only one of the modules 

2 0 Ml, M2 or M3 is connected to the bus at any one time. 

This is achieved by the connection modules 91 being 
controlled by respective inputs 93 l7 93 2 and 93 3 from 
the bus architecture arbitration unit (not shown) . 

Each of the bus connection modules 91 includes 
25 buffering and output circuitry which can be tailored 

for the particular length of bus between it and the 
next connection module. Thus, the signal 
characteristics along the bus can be optimised since 
the load on each of the buffers from the 

3 0 interconnection modules is known. A particular 

advantage of this system is that if, for example, 
module M2 is not required in a particular application, 
then it can be simply omitted from the design, but the 
connection module 91 2 enables the bus characteristics 
35 to be maintained along the . length of the bus. 
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In the split bus examples described earlier, using 
split read/write and transaction buses, each of the 
buses can be treated in the same way illustrated in 
Figure 30. More particularly, the schematic diagram 
.5 shown in Figure 3 0 is applicable to each of the read, 

write and transaction buses individually as well as 
appropriate combinations thereof. In a particular 
example, the write and transaction buses will be 
connected to modules in the manner shown in Figure 30, 

10 but the read bus could be hard wired and permanently 

connected to each of the modules. This could be made 
possible by associating identity data with the incoming 
read data so that each of the modules is able to 
identify the data packets intended for its consumption 

15 without need for reference to the arbitration unit. 

This is a result of the split read/write buses being 
able to maintain separate read and write transactions 
on the bus architecture. 
SECONDARY BUS IDLE USAGE 

2 0 A further improvement may be made to the latency 

of the secondary bus by utilising the time when the bus 
is idle. 

Referring to Figure 31, the secondary arbitration 
unit is normally located at the end of the secondary 
25 bus 4. When a master unit, " (for example module M n+X ) , 

wishes to make a transaction, a request is sent to the 
arbitration unit. The arbitration unit decides whether 
to grant the request according to the arbitration 
techniques described above. Since module M n+X is 

3 0 located near the arbitration unit, this request for 

making a transaction is serviced in a small number of 
clock cycles. 

However, if module M n+1 wishes to make a 
transaction, the request must also be sent to the 
35 arbitration unit, which may be many pipeline lengths 
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away. As a consequence, a request from module M n+1 

takes longer to service than a request from module 

M 
z n+x 

To overcome this problem, whenever the bus is 
5 idle, the arbitration unit places "empty" packets on 

the bus at every available clock cycle. The empty 
packets travel along the bus, and may be used by any 
module that wishes to make a transaction. A module can 
then grab a free packet rather than requesting a 

10 dedicated packet and awaiting its return. 

If module M n+2 makes a request for a transaction, 
but in the meantime receives an empty packet from the 
arbitration unit, it is free to use the empty packet 
rather than await for its requested packet to return. 

15 However, when the requested packet eventually returns 

at a later time, module M n+2 can then convert this 
packet which it no longer needs into an empty packet. 
This converted packet can then continue along the bus, 
and may be used by another module wishing to make a 

20 transaction. 

Using the idle bus time in this manner enables the 
average transaction request-, time to be significantly 
reduced . 

The apparatus described above may be used in many 
25 applications, for example, personal computers, laptops, - 

microprocessors, microcomputers, graphics systems, 
Simultaneous Instruction Multiple Data (SIMD) 
applications, parallel processing, set- top boxes 
(satellite decoders and digital television decoders) , 
3 0 or consumer appliances. 
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CLAIMS 

1. Apparatus for a computer system comprising: 
a plurality of modules; 

a storage device; 

means for receiving read transaction requests from 
modules, each request including address data indicating 
the location of the data to be retrieved, and identity 
data indicating the source of the transaction message; 

means for sending the address data to the storage 
device ; 

means for storing the identity data in a queue ; 

means for receiving a retrieved data item from the 
storage device; 

means for matching the retrieved data item with 
the identity data at the head of the queue; and 

means for returning the retrieved data to the 
module identified by the matched identity data. 

2 . A computer system comprising apparatus as 
claimed in claim 1. 

3. An integrated circuit comprising apparatus as 

claimed in claim" 1. 

4. A graphics processing system comprising 
apparatus as claimed in claim 1 . 

5 . A games console comprising apparatus as 
claimed in claim 1. 
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